
fix: correct cache_hit_rate calculation and fix Vercel stream tool call handling#10994

Merged
sestinj merged 2 commits into main from nate/fix-cache-hit-rate-telemetry
Mar 4, 2026

Conversation


@sestinj sestinj commented Mar 3, 2026

Summary

  • Fix cache_hit_rate telemetry: The prompt_cache_metrics event was emitted twice per completion, and the cache hit rate denominator used only prompt_tokens (which maps to Anthropic's input_tokens — non-cached only). This caused ratios >> 1 when caching worked well (max observed: 89,892). Fixed by removing the duplicate emission and using the correct total: prompt_tokens + cache_read_tokens + cache_write_tokens.

  • Fix Vercel AI SDK tool call streaming: The Vercel AI SDK streams tool calls as tool-input-start → tool-input-delta → tool-input-end → tool-call. Previously tool-input-start was ignored and tool-call emitted the full call at the end, so streaming consumers never saw the tool call id on intermediate chunks. Now tool-input-start emits the initial chunk with id and function name (matching OpenAI's streaming format), and tool-call is a no-op to avoid duplicating args.

Test plan

  • Unit tests updated and passing for vercelStreamConverter.test.ts (15 tests)
  • Vercel SDK integration tests should now pass in CI (locally blocked by missing @ai-sdk/xai dep, env-only issue)

Two bugs in prompt_cache_metrics telemetry:

1. Duplicate emission: prompt_cache_metrics was emitted twice per API
   request — once using `actualInputTokens` and again using
   `fullUsage.prompt_tokens`. This doubled all event counts in PostHog
   and produced conflicting values.

2. Wrong denominator: cache_hit_rate was calculated as
   `cacheReadTokens / prompt_tokens`, but the Anthropic adapter maps
   `prompt_tokens` to only non-cached input tokens (`input_tokens`),
   excluding cache reads and writes. When caching works well, this
   produces ratios >> 1 (observed max: 89,892). The correct total is
   `prompt_tokens + cache_read_tokens + cache_write_tokens`.

Fix: remove the first duplicate emission and compute total_prompt_tokens
as the sum of all three token types. cache_hit_rate is now a proper 0-1
ratio.
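The before/after denominators can be sketched as below. This is an illustrative TypeScript sketch, not the PR's actual code: the interface and function names are assumptions, but the arithmetic mirrors the description above (`prompt_tokens` holds only non-cached input, so it must be summed with the cache counters before dividing).

```typescript
// Hypothetical usage shape: OpenAI-style prompt_tokens plus Anthropic-style
// cache counters. Field names are assumptions for illustration.
interface PromptCacheUsage {
  prompt_tokens: number;      // non-cached input tokens (Anthropic's input_tokens)
  cache_read_tokens: number;  // input tokens served from the prompt cache
  cache_write_tokens: number; // input tokens written to the prompt cache
}

// Buggy version: the denominator excludes cached tokens, so a warm cache
// drives the "rate" far above 1.
function cacheHitRateBuggy(u: PromptCacheUsage): number {
  return u.cache_read_tokens / u.prompt_tokens;
}

// Fixed version: divide by the full prompt total, yielding a proper 0-1 ratio.
function cacheHitRateFixed(u: PromptCacheUsage): number {
  const total = u.prompt_tokens + u.cache_read_tokens + u.cache_write_tokens;
  return total > 0 ? u.cache_read_tokens / total : 0;
}

// A well-cached request: almost everything is read from cache.
const usage: PromptCacheUsage = {
  prompt_tokens: 10,
  cache_read_tokens: 9_000,
  cache_write_tokens: 990,
};

console.log(cacheHitRateBuggy(usage)); // 900 — nonsense as a "rate"
console.log(cacheHitRateFixed(usage)); // 0.9
```

The better caching works, the smaller `prompt_tokens` gets relative to `cache_read_tokens`, which is exactly why the buggy ratio exploded into the tens of thousands.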

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sestinj sestinj requested a review from a team as a code owner March 3, 2026 05:01
@sestinj sestinj requested review from RomneyDa and removed request for a team March 3, 2026 05:01
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Mar 3, 2026

continue bot commented Mar 3, 2026

Docs Review: No documentation updates needed.

This PR contains internal telemetry fixes (correcting cache_hit_rate calculation and removing duplicate event emission) that don't affect user-facing features, configuration options, or developer workflows. The changes are purely internal to Continue's analytics infrastructure.


@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 1 file

…nverter

The Vercel AI SDK streams tool calls as tool-input-start → tool-input-delta
→ tool-input-end → tool-call. Previously, tool-input-start was ignored (returned
null) and tool-call emitted the full tool call at the end, which meant streaming
consumers never saw the tool call id on intermediate chunks.

Now tool-input-start emits the initial chunk with id and function name (matching
OpenAI's streaming format), and tool-call returns null to avoid duplicating args
already streamed via tool-input-delta.
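The conversion described above can be sketched as a small part-to-chunk mapper. The Vercel part names (tool-input-start, tool-input-delta, tool-input-end, tool-call) come from the PR; the part payload shapes, chunk shape, and `convertPart` function are illustrative assumptions, not the real `vercelStreamConverter` code.

```typescript
// Assumed minimal shapes for Vercel stream parts and OpenAI-style
// tool-call deltas; real types in the codebase will differ.
type VercelPart =
  | { type: "tool-input-start"; id: string; toolName: string }
  | { type: "tool-input-delta"; id: string; delta: string }
  | { type: "tool-input-end"; id: string }
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown };

interface OpenAIToolCallDelta {
  index: number;
  id?: string;
  function: { name?: string; arguments?: string };
}

function convertPart(part: VercelPart): OpenAIToolCallDelta | null {
  switch (part.type) {
    case "tool-input-start":
      // New behavior: emit the initial chunk carrying the id and function
      // name, matching OpenAI's streaming format.
      return { index: 0, id: part.id, function: { name: part.toolName, arguments: "" } };
    case "tool-input-delta":
      // Argument fragments stream through as they arrive.
      return { index: 0, function: { arguments: part.delta } };
    case "tool-input-end":
    case "tool-call":
      // tool-call is a no-op: its args were already streamed via the deltas.
      return null;
  }
}

const parts: VercelPart[] = [
  { type: "tool-input-start", id: "call_1", toolName: "readFile" },
  { type: "tool-input-delta", id: "call_1", delta: '{"path":' },
  { type: "tool-input-delta", id: "call_1", delta: '"a.ts"}' },
  { type: "tool-input-end", id: "call_1" },
  { type: "tool-call", toolCallId: "call_1", toolName: "readFile", input: { path: "a.ts" } },
];

const deltas = parts
  .map(convertPart)
  .filter((d): d is OpenAIToolCallDelta => d !== null);

console.log(deltas.length); // 3: one id/name chunk, then two argument deltas
```

Under the old behavior the first emitted chunk would have been an argument delta with no id, which is what broke streaming consumers that key tool calls by id.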

Generated with [Continue](https://continue.dev)

Co-Authored-By: Continue <noreply@continue.dev>
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Mar 3, 2026
@sestinj sestinj changed the title fix: correct cache_hit_rate calculation and remove duplicate emission fix: correct cache_hit_rate calculation and fix Vercel stream tool call handling Mar 3, 2026
@sestinj sestinj merged commit ec7030d into main Mar 4, 2026
59 of 60 checks passed
@sestinj sestinj deleted the nate/fix-cache-hit-rate-telemetry branch March 4, 2026 15:57
@github-project-automation github-project-automation bot moved this from Todo to Done in Issues and PRs Mar 4, 2026
@github-actions github-actions bot locked and limited conversation to collaborators Mar 4, 2026

Labels

size:L This PR changes 100-499 lines, ignoring generated files.

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

1 participant